Clickstream Clustering using Weighted Longest Common Subsequences

Authors

  • Arindam Banerjee
  • Joydeep Ghosh
Abstract

Categorizing visitors based on their interactions with a website is a key problem in web usage mining. The clickstreams generated by different users often follow distinct patterns, and knowledge of these patterns may help in providing customized content. In this paper, we propose a novel and effective algorithm for clustering web users based on a function of the longest common subsequence of their clickstreams that takes into account both the trajectory taken through a website and the time spent at each page. Results are presented on web logs of www.sulekha.com to illustrate the techniques.

Keywords: web usage mining, clickstream, subsequence, clustering.
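The abstract does not spell out the similarity function, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a clickstream is a list of (page, seconds) pairs, finds one longest common subsequence of the two page paths, and scales the path overlap by how closely the time spent on each matched page agrees. The names `lcs_pairs` and `similarity`, the min/max time ratio, and the normalization by the longer path are all illustrative assumptions.

```python
from typing import List, Tuple

Click = Tuple[str, float]  # (page identifier, seconds spent on that page)


def lcs_pairs(a: List[Click], b: List[Click]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of one longest common subsequence of the page paths."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of the suffixes a[i:] and b[j:]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i][0] == b[j][0]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover the matched positions.
    pairs = []
    i = j = 0
    while i < m and j < n:
        if a[i][0] == b[j][0]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def similarity(a: List[Click], b: List[Click]) -> float:
    """Hypothetical time-weighted LCS similarity in [0, 1] (an assumption, not the paper's)."""
    pairs = lcs_pairs(a, b)
    if not pairs:
        return 0.0
    # Time agreement on each matched page: ratio of the shorter to the longer visit.
    ratios = []
    for i, j in pairs:
        lo, hi = sorted((a[i][1], b[j][1]))
        ratios.append(lo / hi if hi > 0 else 1.0)
    time_agreement = sum(ratios) / len(ratios)
    # Path overlap: share of the longer clickstream covered by the common subsequence.
    path_overlap = len(pairs) / max(len(a), len(b))
    return path_overlap * time_agreement


user_a = [("home", 10.0), ("news", 30.0), ("sports", 5.0)]
user_b = [("home", 20.0), ("sports", 5.0)]
print(similarity(user_a, user_b))
```

A matrix of such pairwise similarities could then be passed to any clustering method that accepts precomputed similarities; which clustering algorithm the paper actually uses is not stated in the abstract.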

Similar resources

A Multiobjective Approach to the Weighted Longest Common Subsequence Problem

Finding the Longest Common Subsequence in Weighted Sequences (WLCS) is an important problem in computational biology and bioinformatics. In this paper, we model this problem as a multiobjective optimization problem. As a result, we propose a novel and efficient algorithm that not only finds a WLCS but also the set of all possible solutions. The time complexity of the algorithm depends primarily...

A Recommender System Approach for Classifying User Navigation Patterns Using Longest Common Subsequence Algorithm

Predicting users' future movements and intentions from their clickstream data is a major challenge in Web-based recommendation systems. Web usage mining based on the users’ clickstream data has become the subject of exhaustive research, as its potential for Web-based personalized services, predicting users' near-future intentions, adaptive Web sites and customer profiling is re...

A Greedy Approach for Computing Longest Common Subsequences

This paper presents an algorithm for computing Longest Common Subsequences of two sequences. Given two strings X and Y of lengths m and n, respectively, we present a greedy algorithm that requires O(n log s) preprocessing time, where s is the number of distinct symbols appearing in string Y, and O(m) time to determine the Longest Common Subsequences.

DISCOVERY of LONGEST INCREASING SUBSEQUENCES and its VARIANTS using DNA OPERATIONS

The Longest Increasing Subsequence (LIS) and Common Longest Increasing Subsequence (CLIS) are important in many data mining applications. We propose algorithms to discover LIS and CLIS from varied databases. This work finds all increasing subsequences from the given database, finds increasing subsequences in n sliding window, longest increasing sequences in one and more sequences, decrea...

Computing the Number of Longest Common Subsequences

This note provides very simple, efficient algorithms for computing the number of distinct longest common subsequences of two input strings and for computing the number of LCS embeddings.

Publication date: 2001